Voicing features for robust speech detection

Authors

  • Trausti T. Kristjansson
  • Sabine Deligne
  • Peder A. Olsen
Abstract

Accurate speech activity detection is a challenging problem in the car environment, where high background noise and high-amplitude transient sounds are common. We investigate a number of features designed to capture the harmonic structure of speech. We separately evaluate three important characteristics of these features: 1) discriminative power, 2) robustness to greatly varying SNR and channel characteristics, and 3) performance when used in conjunction with MFCC features. We propose a new feature, the Windowed Autocorrelation Lag Energy (WALE), which has desirable properties.
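
The abstract does not define WALE, so the following is not the authors' feature. It is only a minimal sketch of a generic harmonicity cue of the kind the abstract describes: the peak of the normalized autocorrelation over lags in a plausible pitch range. The function name, the 60-400 Hz pitch range, and the frame handling are all assumptions made for illustration.

```python
import math


def max_normalized_autocorrelation(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Peak normalized autocorrelation over pitch-range lags (hypothetical helper).

    Values near 1 suggest a strongly periodic (voiced-like) frame;
    values near 0 suggest an aperiodic one. This is NOT the WALE feature.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)         # lag-0 autocorrelation
    if energy == 0.0:
        return 0.0                         # silent frame: no periodicity evidence
    # Assumed pitch range converted to a lag range in samples.
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(n - 1, int(sample_rate / f0_min))
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        best = max(best, r / energy)
    return best


# Usage: a 200 Hz tone sampled at 8 kHz, 50 ms frame (synthetic, for illustration).
tone = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(400)]
score = max_normalized_autocorrelation(tone, 8000)
```

Because the score is normalized by the frame energy, it does not depend on signal level, which is one reason autocorrelation-based cues are commonly considered when SNR and channel gain vary widely.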

Related resources

A linked-HMM model for robust voicing and speech detection

We present a novel method for simultaneous voicing and speech detection based on a linked-HMM architecture, with robust features that are independent of the signal energy. Because this approach models the change in dynamics between speech and non-speech regions, it is robust to low sampling rates, significant levels of additive noise, and large distances from the microphone. We demonstrate the ...

Real time robust speech detection for text independent speaker recognition

Speaker recognition systems employ a speech detection algorithm and use only frames detected as speech for further processing. The accuracy obtained by a speaker recognition system depends on the method that is used to detect speech, in particular for real-life deployments where the incoming speech varies significantly in loudness and noise characteristics. Also, actual deployments mandate real...

Noise robust digit recognition using a glottal radar sensor for voicing detection

A voicing feature is used in concatenation to MFCC features to increase the performance of digit recognition at both low and high SNRs. The problem of noise robust extraction of the voicing feature is solved by using the glottal electromagnetic sensor (GEMS). The GEMS device provides reliable voicing information at all SNRs and noise environments. It is shown that although the voicing feature i...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

All for one: feature combination for highly channel-degraded speech activity detection

Speech activity detection (SAD) on channel transmissions is a critical preprocessing task for speech, speaker and language recognition or for further human analysis. This paper presents a feature combination approach to improve SAD on highly channel degraded speech as part of the Defense Advanced Research Projects Agency’s (DARPA) Robust Automatic Transcription of Speech (RATS) program. The key...

Publication date: 2005